99 research outputs found

    Decision Stacks: Flexible Reinforcement Learning via Modular Generative Models

    Reinforcement learning presents an attractive paradigm to reason about several distinct aspects of sequential decision making, such as specifying complex goals, planning future observations and actions, and critiquing their utilities. However, the combined integration of these capabilities poses competing algorithmic challenges in retaining maximal expressivity while allowing for flexibility in modeling choices for efficient learning and inference. We present Decision Stacks, a generative framework that decomposes goal-conditioned policy agents into 3 generative modules. These modules simulate the temporal evolution of observations, rewards, and actions via independent generative models that can be learned in parallel via teacher forcing. Our framework guarantees both expressivity and flexibility in designing individual modules to account for key factors such as architectural bias, optimization objective and dynamics, transferability across domains, and inference speed. Our empirical results demonstrate the effectiveness of Decision Stacks for offline policy optimization for several MDP and POMDP environments, outperforming existing methods and enabling flexible generative decision making. Comment: published at NeurIPS 2023, project page: https://siyan-zhao.github.io/decision-stacks

    Group Preference Optimization: Few-Shot Alignment of Large Language Models

    Many applications of large language models (LLMs), ranging from chatbots to creative writing, require nuanced subjective judgments that can differ significantly across different groups. Existing alignment algorithms can be expensive to align for each group, requiring prohibitive amounts of group-specific preference data and computation for real-world use cases. We introduce Group Preference Optimization (GPO), an alignment framework that steers language models to preferences of individual groups in a few-shot manner. In GPO, we augment the base LLM with an independent transformer module trained to predict the preferences of a group for the LLM generations. For few-shot learning, we parameterize this module as an in-context autoregressive transformer and train it via meta-learning on several groups. We empirically validate the efficacy of GPO through rigorous evaluations using LLMs with varied sizes on three human opinion adaptation tasks. These tasks involve adapting to the preferences of US demographic groups, global countries, and individual users. Our results demonstrate that GPO not only aligns models more accurately but also requires fewer group-specific preferences, and less training and inference computing resources, outperforming existing strategies such as in-context steering and fine-tuning methods. Comment: 24 pages, 12 figures

    Mapping spatial and temporal distribution information of plantations in Guangxi from 2000 to 2020

    Plantations are formed entirely by artificial planting, which distinguishes them from natural forests. The rapid expansion of plantation forestry has brought about a series of ecological and environmental problems. Timely and accurate information on the distribution of plantation resources and continuous monitoring of the dynamic changes in plantations are of great significance. However, plantations have spectral and texture characteristics similar to those of natural forests. In addition, cloud and rain greatly affect image quality in large-area mapping. Here, we tested the possibility of applying Continuous Change Detection and Classification to distinguish plantations from natural forests and described the spatiotemporal dynamic changes of plantations. We adopted the Continuous Change Detection and Classification algorithm and used all available Landsat images from 2000 to 2020 to map annual plantation forest distribution in Guangxi Zhuang Autonomous Region, China, and analyzed their spatial and temporal dynamic changes. The overall accuracy of the plantation extraction is 88.77%. Plantations in Guangxi increased significantly in the past 20 years, from 2.37 × 10⁶ ha to 5.11 × 10⁶ ha. Guangxi is expanding new plantation land every year, with the largest expansion area, about 2.58 × 10⁵ ha, in 2009. Over the past 20 years, plantations in Guangxi have clearly shown a tendency to expand from the southeast to the northwest, replacing natural forests and farmland. 30% of plantations have experienced at least one logging-and-replanting rotation event. Logging rotation events occur more intensively in areas with dense plantation forests. Our study demonstrates that using fitting coefficients from the Continuous Change Detection and Classification algorithm is effective for extracting plantations and mitigating the adverse effects of clouds and rain on optical images at large scale. It provides a fast and effective method for long-term, large-area plantation identification and spatiotemporal distribution information extraction, as well as strong data support and a decision reference for plantation investigation, monitoring and management.

    Treatment of visual axis opacification and secondary membranes with Nd:YAG laser after pediatric cataract surgery under intranasal sedation

    Purpose: To describe neodymium-doped yttrium-aluminum-garnet (Nd:YAG) laser treatment of visual axis opacification (VAO) and secondary membranes in pediatric patients with cataracts under intranasal dexmedetomidine sedation. Methods: Twenty eyes of 17 patients with secondary membrane formation after cataract extraction were enrolled in this study. Intranasal dexmedetomidine sedation (3 μg/kg) was administered, and Nd:YAG laser (Ellex Super Q, Adelaide, Australia) procedures were performed with children in the sitting position with their chin supported on a laser delivery slit lamp. Preoperative and postoperative visual acuities were documented, and medical records were reviewed. Results: The age of the patients ranged from 5 to 83 months (31.82 ± 27.73). Nineteen (95.0%) eyes had congenital cataracts and one (5.0%) had a traumatic cataract. Nd:YAG laser treatment was performed for VAO in ten (50.0%) eyes, pupillary membranes in three (15.0%) eyes, pupillary cortical proliferation in six (30.0%) eyes, and anterior capsule contraction in one (5.0%) eye. Five (25.0%) eyes demonstrated visual acuity improvement, whereas six (30.0%) eyes remained unchanged after laser treatment. The recurrence rate was 30.0% and four eyes underwent a second Nd:YAG membranectomy. No side effects or tolerances due to sedative drugs were observed. Conclusion: Nd:YAG laser membranectomy under intranasal dexmedetomidine sedation was safely performed in children as young as 5 months old in a sitting position. This approach facilitates patient convenience, doctor proficiency, and cost reductions. Patients with recurrence can be treated by repeating the procedure.

    TRY plant trait database – enhanced coverage and open access

    Plant traits - the morphological, anatomical, physiological, biochemical and phenological characteristics of plants - determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait‐based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits - almost complete coverage for ‘plant growth form’. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait–environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives

    Feeling Stories: Enriching Story Listening Experience for Children with Haptic Feedback

    Story listening contributes greatly to children’s development, and much research has explored different modalities of story listening in children. However, few studies have utilized haptic sensory input. In the present 2-session study, we implemented a haptic vest that generates haptic sensations on a child’s back as he or she listens to stories. In the first session, child participants rated how realistic a sensation was in comparison to a language phrase. In the second session, the same child participants listened to two stories with the vest on. At the end of each story, the child participants retold the story and then answered 9 comprehension questions. At the end of the session, the child participants indicated which one of the two stories they liked better. Our study showed that 6-year-old children had the ability to associate haptic sensations with semantic meanings. They could also distinguish between sensations with congruent semantic meanings and those with incongruent semantic meanings. From the second session of the study, we observed an increase in children’s comprehension of the stories when they felt story-relevant sensations from the vest. However, this was only true for the 5- and 6-year-olds. We did not find a similar effect in child participants’ retelling task. These results confirmed that 6-year-olds had developed an association between haptics and semantics. In addition, sensations that were story-related improved children’s performance on story comprehension. We showed that haptic sensory input could improve story processing for children.